Knowledge Discovery in Variant Databases Using Inductive Logic Programming

Authors

  • Hoan Nguyen
  • Tien-Dao Luu
  • Olivier Poch
  • Julie D. Thompson
Abstract

Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery in databases (KDD) approach, based on inductive logic programming (ILP), to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. The inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/.

Similar resources

Knowledge Discovery in databases - An Inductive Logic Programming Approach

The need for learning from databases has increased along with their number and size. The new field of Knowledge Discovery in Databases (KDD) develops methods that discover relevant knowledge in very large databases. Machine learning, statistics, and database methodology contribute to this exciting field. In this paper, the discovery of knowledge in the form of Horn clauses is described. A case stud...

Knowledge Discovery from Structured Mammography Reports Using Inductive Logic Programming

The development of large mammography databases provides an opportunity for knowledge discovery and data mining techniques to recognize patterns not previously appreciated. Using a database from a breast imaging practice containing patient risk factors, imaging findings, and biopsy results, we tested whether inductive logic programming (ILP) could discover interesting hypotheses that could subse...

A Logical Framework for Frequent Pattern Discovery in Spatial Data

In recent times, several extensions of data mining methods and techniques have been explored, aiming at dealing with advanced databases. Many promising applications of inductive logic programming (ILP) to knowledge discovery in databases have also emerged in order to benefit from the semantics and inference rules of first-order logic. In this paper, an ILP framework for frequent pattern discovery in spat...

The Parallelization of a Knowledge Discovery System with Hypergraph Representation

Knowledge discovery is a time-consuming and space-intensive endeavor. By distributing such an endeavor, we can diminish both time and space. System INDED (pronounced "indeed") is an inductive implementation that performs rule discovery using the techniques of inductive logic programming and accumulates and handles knowledge using a deductive nonmonotonic reasoning engine. We present four schemes...

SPADA: A Spatial Association Discovery System*

This paper presents a spatial association discovery system, named SPADA, which has been developed according to the theoretical framework of inductive databases. Our approach considers inductive databases as deductive databases with an integrated inductive component and relies on techniques borrowed from the field of Inductive Logic Programming (ILP). In SPADA, an ILP module supports the process...

Journal title:

Volume 7, Issue

Pages -

Publication date: 2013